Abstract: Motivated by the need to secure cyber-physical 
systems against attacks, we consider the problem of estimating 
the state of a noisy linear dynamical system when a subset of 
sensors is arbitrarily corrupted by an adversary. We propose a 
secure state estimation algorithm and derive (optimal) bounds 
on the achievable state estimation error. In addition, as a result 
of independent interest, we give a coding theoretic interpretation 
for prior work on secure state estimation against sensor attacks 
in a noiseless dynamical system. 

I. Introduction 

Cyber-physical systems (CPS) manage the vast majority of 
today's critical infrastructure, and securing such CPS against 
malicious attacks is a problem of growing importance [?]. As 
a stepping stone towards securing complex CPS deployed in 
practice, several recent works have studied security problems 
in the context of linear dynamical systems [?], leading to a 
fundamental understanding of how the system 
dynamics can be leveraged for security guarantees. With this 
motivation, in this paper we focus on securely estimating the 
state of a linear dynamical system from a set of noisy and 
maliciously corrupted sensor measurements. We restrict the 
sensor attacks to be sparse in nature, i.e., an adversary can 
arbitrarily corrupt a subset of sensors in the system. 

Prior work related to secure state estimation against sensor 
attacks in linear dynamical systems can be broadly categorized 
into three classes depending on the noise model for sensor 
measurements: 1) noiseless, 2) bounded non-stochastic noise, 
and 3) Gaussian noise. For the noiseless setting, the work 
reported in [?] shows that, under a strong notion 
of observability, sensor attacks (modeled as a sparse attack 
vector) can always be detected and isolated, and hence the 
state of the system can be exactly estimated. In contrast, when 
the sensor measurements are affected by noise as well as 
maliciously corrupted, the problem of distinguishing between 
noise and attack vector arises. Results reported in [?] 
are representative of the second class: bounded non-stochastic 
noise. They provide sufficient conditions for distinguishing the 
sparse attack vector from bounded noise but do not guarantee 
the optimality of their estimation algorithm. The work reported 
in this paper falls in the third class: Gaussian noise. Prior 
work in this class includes [?]. In [?], the 
analysis is restricted to detecting a class of sensor attacks 
called replay attacks (i.e., attacks in which legitimate sensor 
outputs are replaced with outputs from previous time instants). 
In [?], the authors focus on the performance degradation of 
a scalar Kalman filter (i.e., scalar state and a single sensor) 
when the sensor is under attack. Since they consider a single 
sensor setup, attack sparsity across multiple sensors is not 
studied, and in addition, they focus on an adversary whose 
objective is to degrade the estimation performance and stay 
undetected at the same time (thereby restricting the class of 
sensor attacks). In [?] and [?], robustification approaches for 
state estimation against sparse sensor attacks are proposed, but 
they lack optimality guarantees against arbitrary sensor attacks. 

In contrast to prior work in the Gaussian noise setup, 
we consider a general linear dynamical system and give 
(optimal) guarantees on the achievable state estimation error 
against arbitrary sensor attacks. The following toy example is 
illustrative of the nature of the problem addressed in this paper 
and some of the ideas behind our solution. 

Example 1: Consider a linear dynamical system with a 
scalar state $x(t)$ such that $x(t+1) = x(t) + w(t)$, where $w(t)$ 
is the process noise following a Gaussian distribution with 
zero mean and is instantiated i.i.d. over time. The system has 
three sensors (indexed by $d$) with outputs $y_d(t) = x(t) + v_d(t)$, 
where $v_d(t)$ is the sensor noise at sensor $d$. Similarly to the 
process noise, $v_d(t)$ is Gaussian distributed with zero mean 
and is instantiated i.i.d. over time. The sensor noise is also 
independent across sensors. Now, consider an adversary which 
can attack any one of the sensors and arbitrarily change its 
output. In the absence of sensor noise, it is trivial to detect 
such an attack since the two good sensors (not attacked by the 
adversary) will have the same output. Hence, a majority based 
rule on the outputs leads to the exact state. However, in the 
presence of sensor noise, even the good sensors may not have 
the same output and a simple majority based rule cannot be 
used for estimation. In this paper, we build on the intuition that 
we may still be able to identify sensors whose outputs can lead 
to a good state estimate by leveraging the noise statistics over 
a large enough time window. In particular, our approach for 
this example would be to hypothesize a subset of two sensors 
as good, and then check whether the outputs from the two 
sensors are consistent with the Kalman state estimate based 
on outputs from the same subset of sensors. Furthermore, we 
show in this paper that such an approach leads to the optimal 
state estimation error for the given adversarial setup. 
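
The failure of the majority rule under noise in Example 1 can be seen in a few lines. The following is a minimal sketch, not part of the proposed algorithm; the function name `majority_value`, the tolerance parameter and the numerical values are illustrative assumptions.

```python
def majority_value(outputs, tol=0.0):
    """Return an output that a strict majority of sensors agree on
    (within tol), or None if no such output exists."""
    for candidate in outputs:
        votes = sum(1 for o in outputs if abs(o - candidate) <= tol)
        if votes > len(outputs) / 2:
            return candidate
    return None


x = 1.5                              # true scalar state (illustrative value)
noiseless = [x, x, 99.0]             # sensor 3 attacked, no sensor noise
noisy = [x + 0.1, x - 0.2, 99.0]     # same attack, with small sensor noise

result_noiseless = majority_value(noiseless)   # the two good sensors agree
result_noisy = majority_value(noisy)           # no two outputs coincide
```

With exact agreement required, the noiseless case recovers the state while the noisy case yields no majority, which is why the approach in this paper relies on noise statistics over a time window instead.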

In this paper, we generalize the Kalman filter based 
approach in the above example to a general linear dynamical 
system with sensor and process noise. Our main contributions 
can be listed as follows: 

• We give optimal guarantees on the achievable state 
estimation error against arbitrary sensor attacks and 
propose an algorithm to achieve the same guarantees; 

• As a result of independent interest, we give a coding 
theoretic interpretation (alternate proof) for the 
necessary and sufficient conditions for secure state 
estimation in the absence of noise [?] (known 
as the sparse observability condition). 

The remainder of this paper is organized as follows. 
Section II deals with the setup. The main results are stated 
in Section III. Section IV considers the simpler setting of a 
scalar state and illustrates the main ideas behind our estimation 
algorithm, and Section V considers its generalization to a vector 
state. Finally, we discuss the coding theoretic view of the 
sparse observability condition (5) in Section VI. 

II. Setup 


A. System model 

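We consider a linear dynamical system with sensor attacks 
as shown below: 

$$\mathbf{x}(t+1) = \mathbf{A}\mathbf{x}(t) + \mathbf{w}(t), \qquad \mathbf{y}(t) = \mathbf{C}\mathbf{x}(t) + \mathbf{v}(t) + \boldsymbol{\phi}(t), \tag{1}$$

where $\mathbf{x}(t) \in \mathbb{R}^n$ denotes the state of the plant at time $t \in \mathbb{N}$, 
$\mathbf{w}(t) \in \mathbb{R}^n$ denotes the process noise at time $t$, $\mathbf{y}(t) \in \mathbb{R}^p$ 
denotes the output of the plant at time $t$ and $\mathbf{v}(t) \in \mathbb{R}^p$ denotes the 
sensor noise at time $t$. The process noise $\mathbf{w}(t) \sim \mathcal{N}(\mathbf{0}, \sigma_w^2 \mathbf{I}_n)$, 
i.e., $\mathbf{w}(t)$ is Gaussian distributed with zero mean and covariance 
matrix $\sigma_w^2 \mathbf{I}_n$, where $\mathbf{I}_n$ is the identity matrix of dimension 
$n$ and $\sigma_w \in \mathbb{R}$. Similarly, sensor noise $\mathbf{v}(t) \sim \mathcal{N}(\mathbf{0}, \sigma_v^2 \mathbf{I}_p)$. 
Both $\mathbf{v}(t)$ and $\mathbf{w}(t)$ are instantiated i.i.d. over time, and $\mathbf{v}(t)$ is 
independent of $\mathbf{w}(t)$. 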

The sensor attack vector $\boldsymbol{\phi}(t) \in \mathbb{R}^p$ in (1) is introduced 
by a $k$-adversary defined as follows. A $k$-adversary has access 
to any $k$ out of the $p$ sensors in the system. Specifically, let 
$\mathcal{K} \subset \{1,2,\dots,p\}$ denote the set of attacked sensors (with $|\mathcal{K}| = k$). 
The $k$-adversary can observe the actual outputs in the $k$ 
attacked sensors and change them arbitrarily. Specifically, the 
output of an attacked sensor $j \in \mathcal{K}$ can be expressed as 

$$y_j(t) = \mathbf{c}_j^T \mathbf{x}(t) + v_j(t) + \phi_j(t), \tag{2}$$

where $T$ denotes the matrix transpose operation, $\mathbf{c}_j^T$ is the 
$j$th row of $\mathbf{C}$, $v_j(t)$ is the noise at sensor $j$ and $\phi_j(t)$ is 
the adversarial corruption introduced at sensor $j$. For $j \notin \mathcal{K}$, 
$\phi_j(t) = 0$. The adversary's choice of $\mathcal{K}$ is unknown but is 
assumed to be constant over time (static adversary). The 
adversary is assumed to have unbounded computational power, 
and knows the system parameters (e.g., $\mathbf{A}$ and $\mathbf{C}$) and noise 
statistics (e.g., $\sigma_w^2$ and $\sigma_v^2$). However, the adversary is limited 
to have only causal knowledge of the process noise and the 
sensor noise in good sensors (not attacked by the adversary). 
We discuss this assumption in more detail in Section II-C. 
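
To make the system model concrete, the following is a minimal simulation sketch of (1) with a static $k$-adversary. The function name `simulate_attacked_system`, the callback `attack_fn` and the numerical values are illustrative assumptions; the callback only sees the current outputs of the attacked sensors, which is one simple way to respect the causal knowledge restriction.

```python
import numpy as np


def simulate_attacked_system(A, C, sigma_w, sigma_v, attacked, attack_fn,
                             T, x0, seed=0):
    """Simulate x(t+1) = A x(t) + w(t), y(t) = C x(t) + v(t) + phi(t).

    attacked  : list of attacked sensor indices (the set K, fixed over time)
    attack_fn : hypothetical callback (t, true outputs of attacked sensors)
                -> corruption phi_j(t) for those sensors
    """
    rng = np.random.default_rng(seed)
    n, p = A.shape[0], C.shape[0]
    x = np.asarray(x0, dtype=float)
    states = np.zeros((T, n))
    outputs = np.zeros((T, p))
    for t in range(T):
        clean = C @ x + sigma_v * rng.standard_normal(p)    # C x(t) + v(t)
        phi = np.zeros(p)                                   # zero on good sensors
        phi[attacked] = attack_fn(t, clean[attacked])
        states[t] = x
        outputs[t] = clean + phi
        x = A @ x + sigma_w * rng.standard_normal(n)        # state update
    return states, outputs


# Scalar state, three sensors, sensor index 2 attacked.
A = np.array([[1.0]])
C = np.ones((3, 1))
zero_out = lambda t, y_attacked: -y_attacked    # cancels the true output
states, outputs = simulate_attacked_system(A, C, 1.0, 1.0, [2], zero_out,
                                           T=50, x0=[0.0])
```

Here the attacked sensor always reports zero, so its output carries no information about the state, while the two good sensors still follow (1) with $\boldsymbol{\phi}$ equal to zero.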


B. State estimation: prediction and filtering 

In this paper, we address two state estimation problems: 
(1) state prediction and (2) state filtering. 

In the state prediction problem, the goal is to estimate 
the state at time $t$ based on outputs till time $t-1$. In the 
absence of sensor attacks, using a Kalman filter for predicting 
the state in (1) leads to the optimal (MMSE) error covariance 
asymptotically [?]. In particular, the Kalman filter update rule 
can be written as: 

$$\hat{\mathbf{x}}(t+1) = \mathbf{A}\hat{\mathbf{x}}(t) + \mathbf{L}(t)\left(\mathbf{y}(t) - \mathbf{C}\hat{\mathbf{x}}(t)\right), \tag{3}$$

where $\hat{\mathbf{x}}(t+1)$ is the state estimate at time $t+1$ and $\mathbf{L}(t)$ is 
the Kalman filter gain. For a Kalman filter in steady state [?], 
the steady state gain satisfies $\mathbf{L}(t) = \mathbf{L}$. Also, we use $P_{opt,s}$ to 
denote the trace of the steady state (prediction) error covariance 
matrix [?] obtained by using a Kalman filter on a sensor 
subset $s \subset \{1,2,\dots,p\}$. 

In contrast to the prediction problem, the goal in the state 
filtering problem is to estimate the state at time $t$ based on 
outputs till time $t$. In the absence of sensor attacks, a Kalman 
filter update rule similar to (3) can be used for the filtering 
problem [?] (see the Appendix for details) and we use $F_{opt,s}$ 
to denote the trace of the steady state (filtering) error covariance 
matrix obtained by using a Kalman filter on a sensor subset $s$. 

C. Causal knowledge assumptions 

At time $t$, the attack vector $\boldsymbol{\phi}(t)$ in (1) depends on the 
knowledge of the adversary at time $t$, and in this context, we 
limit the adversary's knowledge of the process and sensor noise 
along the lines of causality. In particular, for the prediction 
problem we assume the following for a $k$-adversary: 

(A1) The adversary's knowledge at time $t$ is statistically 
independent of $\mathbf{w}(t')$ for $t' > t$, i.e., $\boldsymbol{\phi}(t)$ is statistically 
independent of $\{\mathbf{w}(t')\}_{t' > t}$; 

(A2) For a good sensor $d \in \{1,2,\dots,p\} - \mathcal{K}$, the adversary's 
knowledge at time $t$ (and hence $\boldsymbol{\phi}(t)$) is statistically 
independent of $\{v_d(t')\}_{t' > t}$. 

Intuitively, assumptions (A1) and (A2) limit the adversary 
to have only causal knowledge of the process noise and the 
sensor noise in good sensors (not attacked by the adversary). 
Note that, apart from (A1) and (A2), we do not impose 
any restrictions on the statistical properties, boundedness and 
the time evolution of the corruptions introduced by the 
$k$-adversary. In the filtering problem, we replace assumptions 
(A1) and (A2) with (A3) and (A4) as described below: 

(A3) The adversary's knowledge at time $t$ is statistically 
independent of $\mathbf{w}(t')$ for $t' \geq t$, i.e., $\boldsymbol{\phi}(t)$ is statistically 
independent of $\{\mathbf{w}(t')\}_{t' \geq t}$; 

(A4) For a good sensor $d \in \{1,2,\dots,p\} - \mathcal{K}$, the adversary's 
knowledge at time $t$ (and hence $\boldsymbol{\phi}(t)$) is statistically 
independent of $\{v_d(t')\}_{t' \geq t}$. 

Clearly, (A3) is a stronger version of (A1), requiring $\boldsymbol{\phi}(t)$ to 
be independent of $\mathbf{w}(t)$. Similarly, (A4) is a stronger version 
of (A2). 


D. Sparse observability condition 

For the matrix pair $(\mathbf{A},\mathbf{C})$, the observability matrix $\mathbf{O}$ with 
observability index $\mu$ is defined as shown below: 

$$\mathbf{O} = \begin{bmatrix} \mathbf{C} \\ \mathbf{C}\mathbf{A} \\ \vdots \\ \mathbf{C}\mathbf{A}^{\mu-1} \end{bmatrix}. \tag{4}$$


In this context, a linear dynamical system, characterized by the 
pair $(\mathbf{A},\mathbf{C})$, is said to be observable if there exists a positive 
integer $\mu$ such that $\mathbf{O}$ has full column rank. In the absence 
of sensor and process noise, the conditions under which state 
estimation can be done despite sensor attacks have been 
studied in [?]. In particular, a linear dynamical system 
as shown in (1) is called $\theta$-sparse observable if for every subset 
$s \subset \{1,\dots,p\}$ of size $\theta$, the pair $(\mathbf{A}, \mathbf{C}_s)$ is observable (where $\mathbf{C}_s$ 
is formed by the rows of $\mathbf{C}$ corresponding to sensors indexed 
by the elements of $s$). Also, $\theta$ is the smallest positive integer 
to satisfy the above observability property. The condition: 

$$\theta \leq p - 2k, \tag{5}$$

is necessary and sufficient for exact state estimation against a 
$k$-adversary in the absence of process and sensor noise [?]; we 
will refer to this condition as the sparse observability condition. 
We provide a coding theoretic interpretation for the same in 
Section VI. 
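
The definition of $\theta$-sparse observability can be checked by brute force for small systems. The following is a minimal sketch under the assumption that observability of each pair is tested with $n$ blocks in the observability matrix (sufficient by the Cayley-Hamilton theorem); the function names and the example matrices are illustrative and not taken from the paper.

```python
import itertools

import numpy as np


def is_observable(A, C_s):
    """Rank test on the observability matrix [C_s; C_s A; ...; C_s A^(n-1)]."""
    n = A.shape[0]
    blocks = [C_s @ np.linalg.matrix_power(A, i) for i in range(n)]
    return np.linalg.matrix_rank(np.vstack(blocks)) == n


def sparse_observability_index(A, C):
    """Smallest theta such that every theta-subset of sensors is observable."""
    p = C.shape[0]
    for theta in range(1, p + 1):
        subsets = itertools.combinations(range(p), theta)
        if all(is_observable(A, C[list(s), :]) for s in subsets):
            return theta
    return None   # not observable even with all sensors


def satisfies_condition(theta, p, k):
    """Sparse observability condition (5): theta <= p - 2k."""
    return theta is not None and theta <= p - 2 * k


# Illustrative system: two state variables, three sensors.
A = np.diag([1.0, 2.0])
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta = sparse_observability_index(A, C)
```

In this example no single sensor observes both state variables, but every pair of sensors does, so the search stops at subsets of size two. The enumeration is exponential in $p$, so the sketch is only meant for small examples.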

III. Main results 

We first state our achievability result followed by an 
impossibility result. 

Theorem 1 (Achievability): Consider the linear dynamical 
system defined in (1) satisfying the sparse observability 
condition (5) against a $k$-adversary. Assuming (A1) and (A2), 
and a time window $G = \{t_1, t_1+1, \dots, t_1+N-1\}$ for the state 
prediction problem, the following bound on the prediction error 
is achievable against a $k$-adversary. For any $\epsilon > 0$ and $\delta > 0$, 
there exists a large enough $N$ such that: 

$$\mathbb{P}\left( \frac{1}{N}\sum_{t \in G} \mathbf{e}^T(t)\mathbf{e}(t) \leq \max_{s \subset \{1,2,\dots,p\},\ |s| = p-k} \left(P_{opt,s}\right) + \epsilon \right) \geq 1 - \delta, \tag{6}$$

where $\mathbf{e}(t) = \mathbf{x}(t) - \hat{\mathbf{x}}(t)$ is the estimation error for the state 
estimate $\hat{\mathbf{x}}(t)$. In other words, with high probability (w.h.p.), 
the bound 
$\limsup_{N \to \infty} \frac{1}{N}\sum_{t \in G} \mathbf{e}^T(t)\mathbf{e}(t) \leq \max_{s \subset \{1,2,\dots,p\},\ |s| = p-k} \left(P_{opt,s}\right)$ 
is achievable. Similarly, for the state filtering problem, assuming 
(A3) and (A4) against a $k$-adversary, the following bound on 
the corresponding filtering error $\mathbf{e}(t)$ is achievable w.h.p.: 

$$\limsup_{N \to \infty} \frac{1}{N}\sum_{t \in G} \mathbf{e}^T(t)\mathbf{e}(t) \leq \max_{s \subset \{1,2,\dots,p\},\ |s| = p-k} \left(F_{opt,s}\right). \tag{7}$$

The achievability in Theorem 1 is through our proposed 
algorithms, which we discuss in the following sections. The 
impossibility result can be stated as follows. 

Theorem 2 (Impossibility): Consider the linear dynamical 
system defined in (1) and an oracle MMSE estimator that 
has knowledge of $\mathcal{K}$, i.e., the set of sensors attacked by a 
$k$-adversary. Then, there exists an attack sequence $\boldsymbol{\phi}(t)$ such 
that the trace of the prediction error covariance of the oracle 
estimator is bounded from below as follows: 

$$\mathrm{tr}\left(\mathbb{E}\left(\mathbf{e}(t)\mathbf{e}^T(t)\right)\right) \geq P_{opt,s}, \tag{8}$$

where $\mathbf{e}(t)$ above is the oracle estimator's prediction error and 
$s = \{1,2,\dots,p\} - \mathcal{K}$. Similarly, for the filtering problem, 

$$\mathrm{tr}\left(\mathbb{E}\left(\mathbf{e}(t)\mathbf{e}^T(t)\right)\right) \geq F_{opt,s}. \tag{9}$$


Proof: Consider the attack scenario where the outputs 
from all attacked sensors are equal to zero, i.e., the corruption 
$\phi_j(t) = -\mathbf{c}_j^T\mathbf{x}(t) - v_j(t)$, $\forall j \in \mathcal{K}$. Hence, the information 
collected from the attacked sensors cannot enhance the estimation 
performance. Accordingly, the estimation performance from 
the remaining sensors is the best one can expect to achieve. 

Clearly, for the adversary’s best choice of K, the guarantees 
given in our achievability match the impossibility bound (in 
an empirical average sense), and hence, we consider our guar¬ 
antees optimal. We measure the performance of our proposed 
algorithms in terms of empirical average (and not expectation) 
since the resultant error in the presence of attacks may not be 
ergodic. 
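
For a small system, the bound $\max_s P_{opt,s}$ appearing in (6) can be evaluated numerically. The following is a minimal sketch that assumes the steady state prediction error covariance is obtained by iterating the standard Kalman (Riccati) recursion until it settles; the function names, the iteration count and the example values are illustrative assumptions.

```python
import itertools

import numpy as np


def steady_state_prediction_cov(A, C_s, sigma_w, sigma_v, iters=1000):
    """Iterate the Riccati recursion for the one-step prediction covariance."""
    n, m = A.shape[0], C_s.shape[0]
    Q = sigma_w ** 2 * np.eye(n)      # process noise covariance
    R = sigma_v ** 2 * np.eye(m)      # sensor noise covariance (subset s)
    P = Q.copy()
    for _ in range(iters):
        S = C_s @ P @ C_s.T + R
        K = P @ C_s.T @ np.linalg.inv(S)
        P = A @ (P - K @ C_s @ P) @ A.T + Q
    return P


def worst_case_prediction_bound(A, C, sigma_w, sigma_v, k):
    """max over subsets s of size p - k of P_opt,s = tr(prediction covariance)."""
    p = C.shape[0]
    return max(
        np.trace(steady_state_prediction_cov(A, C[list(s), :], sigma_w, sigma_v))
        for s in itertools.combinations(range(p), p - k)
    )


# Setup of Example 1 with unit noise variances (illustrative values).
A = np.array([[1.0]])
C = np.ones((3, 1))
bound = worst_case_prediction_bound(A, C, 1.0, 1.0, k=1)
```

For this scalar example with two good sensors, the recursion reduces to $P = P/(1+2P) + 1$, whose positive fixed point is $(1+\sqrt{3})/2 \approx 1.366$.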

IV. Secure state estimation: scalar state 

In this section, we illustrate the main ideas behind our 
general scheme in the simpler setting of estimating a scalar 
state variable against a $k$-adversary. In particular, we focus on 
the state prediction problem for the system in (1) when the 
state is a scalar and there are $p \geq 2k+1$ sensors (i.e., 1-sparse 
observability condition against a $k$-adversary). For clarifying the 
presence of scalar terms in our analysis, we use the scalar 
version (regular instead of bold face) of the notation developed 
in Section II, i.e., $x(t)$ for the plant's state, $\hat{x}(t)$ for the estimate, 
and $y_d(t) = c_d x(t) + v_d(t)$ for the output of a good sensor 
$d \in \{1,2,\dots,p\} - \mathcal{K}$. We first describe our proposed algorithm 
for a time window $G = \{t_1, t_1+1, \dots, t_1+N-1\}$ of size $N$, and 
then analyze its performance. 

Secure scalar state prediction algorithm: Considering a 
time window $G$, Algorithm 1 shows the secure state prediction 
algorithm for the case when the state is a scalar. The algorithm 
runs a bank of $\binom{p}{p-k}$ Kalman filters in parallel; one Kalman 
filter associated with each distinct set of $p-k$ sensors. For 
each distinct set $s$ of $p-k$ sensors, the corresponding Kalman 
filter fuses all the measurements from these sensors in order 
to calculate the (prediction) estimate $\hat{x}_s(t)$. Using the calculated 
estimate $\hat{x}_s(t)$, we calculate the individual residues for each 
sensor as shown in (10). The algorithm then exhaustively 
searches for the set $s$ of $p-k$ sensors which satisfy the 
residue test shown in (11). If a set $s^*$ satisfies the residue test, 
it is declared good and the corresponding Kalman estimate 
$\hat{x}_{s^*}(t)$ is used as the state estimate for the given time window. 
Intuitively, the residue test checks if the outputs from a given 
sensor set $s$ are consistent with the corresponding Kalman 
estimate over the time window $G$. 

Performance analysis: Consider the set $s$ of $p-k$ 
sensors which are not attacked by the $k$-adversary. Assuming 
that the Kalman filter corresponding to set $s$ is in steady state, it 
can be shown that $\mathbb{E}\left(r_d^2(t)\right) = c_d^2 P_{opt,s} + \sigma_v^2$, $\forall d \in s$ (where 
residue $r_d(t)$ is as defined in (10)). For large enough $N$, due to 
the (strong) law of large numbers (LLN), the residue test will 
be satisfied w.h.p. for at least this set of good sensors. This 
ensures that w.h.p., the algorithm will not return an empty set. 
Also, the estimate $\hat{x}_s(t)$ from this set of good sensors trivially 
achieves the error bound (6). But, since the algorithm can 
return any set of size $p-k$ which satisfies the residue test, 
it may be possible that some of the sensors in the returned set 
are corrupt. In the remainder of our analysis, we show that for 
any set returned by the algorithm, the corresponding Kalman 
estimate achieves the error bound (6). 


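Algorithm 1 Secure State Prediction - scalar case 
1: Enumerate all sets $s \in S$ such that: 

$S = \{s \mid s \subset \{1,2,\dots,p\},\ |s| = p-k\}$. 

2: For each $s \in S$, run a Kalman filter that uses all sensors 
indexed by $s$ and returns estimate $\hat{x}_s(t) \in \mathbb{R}$. 

3: For each $s \in S$, calculate the residues for all sensors $d \in s$ 
over a time window $G = \{t_1, t_1+1, \dots, t_1+N-1\}$ as: 

$$r_d(t) = y_d(t) - c_d \hat{x}_s(t) \quad \forall d \in s,\ \forall t \in G. \tag{10}$$

4: Pick the set $s^* \in S$ which satisfies the following residue 
test: 

$$\frac{1}{N}\sum_{t \in G} r_d^2(t) \leq c_d^2 P_{opt,s^*} + \sigma_v^2 + \epsilon \quad \forall d \in s^*, \tag{11}$$

where $\epsilon > 0$ is a design parameter and can be made 
arbitrarily small for large enough $N$. 

5: Return $s^*$ and $\hat{x}(t) := \hat{x}_{s^*}(t)$ $\forall t \in G$. 


Suppose the algorithm returns a set $s$ of $p-k$ sensors. 
There is definitely one good sensor (say sensor $d$) in this set 
because there can be at most $k$ attacked sensors and $p-k > k$. 
Since the residue test is satisfied for this sensor, we have the 
following constraint: 

$$\begin{aligned}
\frac{1}{N}\sum_{t \in G} r_d^2(t) &\overset{(a)}{=} \frac{1}{N}\sum_{t \in G}\left(c_d x(t) + v_d(t) - c_d \hat{x}_s(t)\right)^2 \\
&= \frac{1}{N}\sum_{t \in G}\left(c_d e(t) + v_d(t)\right)^2 \\
&= \frac{c_d^2}{N}\sum_{t \in G} e^2(t) + \frac{1}{N}\sum_{t \in G} v_d^2(t) + \frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t) \\
&\overset{(b)}{\leq} c_d^2 P_{opt,s} + \sigma_v^2 + \epsilon,
\end{aligned} \tag{12}$$

where (a) follows from $y_d(t) = c_d x(t) + v_d(t)$ for a good sensor 
$d$ and (b) follows from the residue test. The error $e(t)$ above is 
the state estimation (prediction) error at time $t$ (in the presence 
of a $k$-adversary) when $\hat{x}_s(t)$ is used as the state estimate. Using 
LLN, we can make an additional simplification as follows. For 
any $\epsilon > 0$, there exists a large enough $N$ such that: 

$$\frac{c_d^2}{N}\sum_{t \in G} e^2(t) + \frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t) \overset{(a)}{\leq} c_d^2 P_{opt,s} + \left(\sigma_v^2 - \frac{1}{N}\sum_{t \in G} v_d^2(t)\right) + \epsilon \tag{13}$$

$$\overset{(b)}{\leq} c_d^2 P_{opt,s} + 2\epsilon, \tag{14}$$

where (a) follows from (12), and (b) follows w.h.p. due to 
LLN. Our next step will be to show that the cross term 
$\frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t)$ in (13) is vanishingly small w.h.p. as $N \to \infty$; 
this leads to the required bound on $\frac{1}{N}\sum_{t \in G} e^2(t)$ using (14). We 
do so in two steps: first we show that the mean of the cross 
term $\frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t)$ is zero and then show that its variance 
is vanishingly small as $N \to \infty$. 

The mean of the cross term $\frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t)$ can be 
computed as shown below: 

$$\mathbb{E}\left(\frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t)\right) \overset{(a)}{=} \frac{2c_d}{N}\sum_{t \in G}\mathbb{E}\left(e(t)\right)\mathbb{E}\left(v_d(t)\right) = 0, \tag{15}$$

where (a) follows from the independence of $e(t)$ from $v_d(t)$ 
(due to assumption (A2), $\hat{x}_s(t)$ is independent of good sensor 
noise $v_d(t)$ despite sensor attacks). Also, using (15) and taking 
the expectation in (14): 

$$\mathbb{E}\left(\frac{1}{N}\sum_{t \in G} e^2(t)\right) \leq P_{opt,s} + \frac{2\epsilon}{c_d^2}. \tag{16}$$

As the final step in our analysis, we will now show that the 
variance of the cross term $\frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t)$ is vanishingly small 
as $N \to \infty$. For any $\epsilon_1 > 0$, there exists a large enough $N$ such 
that: 

$$\begin{aligned}
\mathrm{Var}\left(\frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t)\right) &= \mathbb{E}\left(\left(\frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t)\right)^2\right) \\
&= \frac{4c_d^2}{N^2}\sum_{t \in G}\mathbb{E}\left(e^2(t)v_d^2(t)\right) + \frac{8c_d^2}{N^2}\sum_{t,t' \in G,\ t<t'}\mathbb{E}\left(e(t)v_d(t)e(t')v_d(t')\right) \\
&\overset{(a)}{=} \frac{4c_d^2\sigma_v^2}{N^2}\sum_{t \in G}\mathbb{E}\left(e^2(t)\right) + \frac{8c_d^2}{N^2}\sum_{t,t' \in G,\ t<t'}\mathbb{E}\left(e(t)v_d(t)e(t')\right)\mathbb{E}\left(v_d(t')\right) \\
&= \frac{4c_d^2\sigma_v^2}{N}\,\mathbb{E}\left(\frac{1}{N}\sum_{t \in G} e^2(t)\right) \\
&\overset{(b)}{\leq} \frac{4c_d^2\sigma_v^2}{N}\left(P_{opt,s} + \frac{2\epsilon}{c_d^2}\right) \leq \epsilon_1,
\end{aligned} \tag{17}$$

where (a) follows from the independence of $e(t)$ from $v_d(t)$ 
and the independence of $v_d(t')$ from $e(t)v_d(t)e(t')$ (for $t' > t$), 
and (b) follows from (16). The above result implies that the cross 
term $\frac{2c_d}{N}\sum_{t \in G} e(t)v_d(t)$ (with zero mean) has vanishingly small 
variance as $N \to \infty$. As a result, using Chebyshev's inequality 
and (14), we have the error bound (6). 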
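
The steps of Algorithm 1 can be sketched for the three-sensor setup of Example 1. This is a minimal illustration under stated assumptions, not a reference implementation: each Kalman filter is run with its steady state gain from the start, the initial state and initial prediction are both taken to be zero, the attack is a constant bias on one sensor, and the names `kalman_predictions`, `secure_scalar_prediction` and `run_demo` as well as all numerical values are hypothetical.

```python
import itertools

import numpy as np


def kalman_predictions(a, c_s, y_s, q, r, riccati_iters=500):
    """Steady-state scalar Kalman predictor using sensors with gains c_s.

    Returns the predictions xhat(t) (based on outputs before time t) and
    the steady-state prediction error variance P (i.e., P_opt,s).
    """
    info = np.sum(c_s ** 2) / r              # information from one output vector
    P = q
    for _ in range(riccati_iters):           # scalar Riccati recursion
        P = a * a / (1.0 / P + info) + q
    P_post = 1.0 / (1.0 / P + info)          # steady-state posterior variance
    xhat = np.zeros(y_s.shape[0])
    pred = 0.0                               # assumed initial prediction
    for t in range(y_s.shape[0]):
        xhat[t] = pred
        post = pred + P_post * np.sum(c_s * (y_s[t] - c_s * pred)) / r
        pred = a * post
    return xhat, P


def secure_scalar_prediction(a, c, y, q, r, k, eps):
    """Steps 1-5 of Algorithm 1: return the first set passing the residue test."""
    p = y.shape[1]
    for s in itertools.combinations(range(p), p - k):
        idx = list(s)
        xhat, P = kalman_predictions(a, c[idx], y[:, idx], q, r)
        passes = all(
            np.mean((y[:, d] - c[d] * xhat) ** 2) <= c[d] ** 2 * P + r + eps
            for d in idx
        )
        if passes:
            return s, xhat
    return None, None


def run_demo(seed=0, N=5000):
    rng = np.random.default_rng(seed)
    a, q, r = 1.0, 1.0, 1.0                  # x(t+1) = x(t) + w(t), unit variances
    c = np.ones(3)
    x = np.zeros(N)
    for t in range(1, N):
        x[t] = a * x[t - 1] + np.sqrt(q) * rng.standard_normal()
    y = c * x[:, None] + np.sqrt(r) * rng.standard_normal((N, 3))
    y[:, 0] += 5.0                           # hypothetical attack: bias on sensor 0
    s_star, xhat = secure_scalar_prediction(a, c, y, q, r, k=1, eps=0.3)
    return s_star, float(np.mean((x - xhat) ** 2))
```

In this sketch the sets containing the biased sensor fail the residue test by a wide margin, so the search settles on the two good sensors. A constant bias is only one easy-to-detect attack; the guarantees in the analysis above cover arbitrary corruptions, which this demo does not exercise.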


V. Secure state estimation: vector state 

In this section, we consider the state estimation problem 
(against a $k$-adversary) for the general linear dynamical system 
described in (1), when the state is a vector. We focus on 
the prediction problem in this section; the filtering problem 
is studied in the Appendix. We assume that the system is 
$\theta$-sparse observable such that it satisfies the sparse 
observability condition (5) against a $k$-adversary. We first introduce 
some additional notation required for our proposed algorithm. 

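Additional notation: Consider a set $s$ of $p-k$ sensors. 
Such a set has $\binom{p-k}{\theta}$ sensor subsets of size $\theta$, and we index 
these subsets of $s$ by $i$. Due to the $\theta$-sparse observability 
condition, each subset $i$ forms an observable pair $(\mathbf{A},\mathbf{C}_i)$ 
with observability matrix $\mathbf{O}_i$ and observability index $\mu_i$; $\mathbf{C}_i$ 
is formed by rows of $\mathbf{C}$ corresponding to subset $i$ of $s$. We 
define matrices $\mathbf{J}_i$ and $\mathbf{M}_{\mu_i}$ as shown below: 

$$\mathbf{J}_i = \begin{bmatrix}
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{C}_i & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{C}_i\mathbf{A} & \mathbf{C}_i & \cdots & \mathbf{0} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{C}_i\mathbf{A}^{\mu_i-2} & \mathbf{C}_i\mathbf{A}^{\mu_i-3} & \cdots & \mathbf{C}_i
\end{bmatrix}, \qquad \mathbf{M}_{\mu_i} = \sigma_w^2\,\mathbf{J}_i\mathbf{J}_i^T + \sigma_v^2\,\mathbf{I}_{\theta\mu_i}.$$

The pseudo-inverse of $\mathbf{O}_i$ is denoted by $\mathbf{O}_i^{\dagger}$. The output from 
sensor subset $i$ (of size $\theta$) at time $t$ is denoted by $\mathbf{y}_i(t) \in \mathbb{R}^{\theta}$. 
We consider the state estimation problem for a time window $G$ 
of size $N$ and assume without loss of generality that $\mu_i$ divides 
$N$ such that $\mu_i N_B = N$. 

Algorithm 2 Secure State Prediction - vector case 
1: Enumerate all sets $s \in S$ such that: 

$S = \{s \mid s \subset \{1,2,\dots,p\},\ |s| = p-k\}$. 

2: For each $s \in S$, run a Kalman filter that uses all sensors 
indexed by $s$ and returns estimate $\hat{\mathbf{x}}_s(t) \in \mathbb{R}^n$. 

3: For each set $s \in S$, enumerate all subsets of size $\theta$ 
and index them by $i$. Let $\mu_i$ be the observability index 
associated with sensor subset $i$. For each subset $i$ of $s$ 
(subset of size $\theta$), calculate the block residue: 

$$\mathbf{r}_i(t) = \begin{bmatrix} \mathbf{y}_i(t) \\ \mathbf{y}_i(t+1) \\ \vdots \\ \mathbf{y}_i(t+\mu_i-1) \end{bmatrix} - \mathbf{O}_i\hat{\mathbf{x}}_s(t) \quad \forall t \in G. \tag{18}$$

4: Pick the set $s^* \in S$ which satisfies the following block 
residue test for each subset $i$ of $s^*$ (subset of size $\theta$). 
Partition $G$ into $\mu_i$ groups $G_0, G_1, \dots, G_{\mu_i-1}$ of size $N_B$ such 
that $G_\ell = \{t \mid ((t - t_1) \bmod \mu_i) = \ell\}$ and check that for each 
$G_\ell$: 

$$\frac{1}{N_B}\sum_{t \in G_\ell}\mathrm{tr}\left(\mathbf{O}_i^{\dagger}\mathbf{r}_i(t)\mathbf{r}_i^T(t)\mathbf{O}_i^{\dagger T}\right) \leq P_{opt,s^*} + \mathrm{tr}\left(\mathbf{O}_i^{\dagger}\mathbf{M}_{\mu_i}\mathbf{O}_i^{\dagger T}\right) + \epsilon, \tag{19}$$

where $\epsilon > 0$ is a design parameter which can be made 
arbitrarily small for large enough $N_B$. 
5: Return $s^*$ and $\hat{\mathbf{x}}(t) := \hat{\mathbf{x}}_{s^*}(t)$ $\forall t \in G$. 
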

Secure state prediction algorithm: Similar to the scalar 
setting, Algorithm 2 runs a bank of $\binom{p}{p-k}$ Kalman filters 
in parallel. For each distinct set $s$ of $p-k$ sensors, the 
corresponding Kalman filter fuses all the measurements from 
these sensors in order to calculate an estimate $\hat{\mathbf{x}}_s(t)$. For a 
sensor set $s$ of size $p-k$ to satisfy the block residue test, 
each of its $\binom{p-k}{\theta}$ subsets should satisfy (19) for each group 
$G_\ell$. If a set $s^*$ satisfies the residue test, it is declared good and 
the corresponding Kalman estimate $\hat{\mathbf{x}}_{s^*}(t)$ is used as the state 
estimate for the given time window. Intuitively, the residue 
test checks if the outputs from every observable sensor subset 
of size $\theta$ within set $s$ are consistent with the corresponding 
Kalman estimate over the time window $G$. We analyze the 
performance of Algorithm 2 in Appendix A. 
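
The left-hand side of the block residue test (19) can be computed for one sensor subset as sketched below. This is a minimal illustration under stated assumptions: the window is truncated so that every block residue only uses available outputs, the groups may therefore differ in size by one, and the function name `block_residue_statistics` and the example system are hypothetical. The threshold on the right-hand side of (19) is not computed here.

```python
import numpy as np


def block_residue_statistics(A, C_i, y_i, xhat, mu):
    """Per-group averages of tr(O^+ r r^T O^+T) = ||O^+ r||^2.

    y_i  : (T, theta) outputs of sensor subset i over the window
    xhat : (T, n) estimates from the Kalman filter of the enclosing set s
    mu   : observability index used to build O_i
    """
    O_i = np.vstack([C_i @ np.linalg.matrix_power(A, j) for j in range(mu)])
    O_pinv = np.linalg.pinv(O_i)
    T = y_i.shape[0]
    values = []
    for t in range(T - mu + 1):
        stacked = y_i[t:t + mu].reshape(-1)       # [y_i(t); ...; y_i(t+mu-1)]
        r = stacked - O_i @ xhat[t]               # block residue at time t
        values.append(float(np.sum((O_pinv @ r) ** 2)))
    values = np.array(values)
    # Group l collects the times t with (t - t_1) mod mu == l.
    return np.array([values[l::mu].mean() for l in range(mu)])


# Noiseless sanity check on an illustrative system (not from the paper).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
C_i = np.array([[1.0, 0.0]])                      # one sensor, observable with mu = 2
states = np.zeros((10, 2))
states[0] = [1.0, 0.5]
for t in range(1, 10):
    states[t] = A @ states[t - 1]
y_i = states @ C_i.T
stats_exact = block_residue_statistics(A, C_i, y_i, states, mu=2)
stats_offset = block_residue_statistics(A, C_i, y_i, states - np.array([1.0, 0.0]), mu=2)
```

With perfect estimates and no noise the statistic vanishes, and with a constant estimation error $\mathbf{e}$ it equals $\mathbf{e}^T\mathbf{e}$, since $\mathbf{O}_i^{\dagger}\mathbf{O}_i$ is the identity when $\mathbf{O}_i$ has full column rank.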

VI. Sparse observability: Coding theoretic view 

In this section, we revisit the sparse observability condition 
(5) against a $k$-adversary and give a coding theoretic 
interpretation for the same. We first describe our interpretation for a 
linear system, and then discuss how it can be generalized for 
non-linear systems. 

Consider the linear dynamical system in (1) without the 
process and sensor noise (i.e., $\mathbf{x}(t+1) = \mathbf{A}\mathbf{x}(t)$, $\mathbf{y}(t) = \mathbf{C}\mathbf{x}(t) + \boldsymbol{\phi}(t)$). 
If the system's initial state is $\mathbf{x}(0) \in \mathbb{R}^n$ and the system 
is $\theta$-sparse observable, then clearly in the absence of sensor 
attacks, by observing the outputs from any $\theta$ out of $p$ sensors 
for $n$ time instants ($t = 0,1,\dots,n-1$) we can exactly recover 
$\mathbf{x}(0)$ and hence, exactly estimate the state of the plant. A coding 
theoretic view of this can be given as follows. Consider the 
outputs from sensor $d \in \{1,2,\dots,p\}$ for $n$ time instants as a 
symbol $\mathcal{Y}_d \in \mathbb{R}^n$. Thus, in the (symbol) observation vector 
$\mathcal{Y} = [\mathcal{Y}_1\ \mathcal{Y}_2\ \cdots\ \mathcal{Y}_p]$, due to $\theta$-sparse observability, any $\theta$ 
symbols are sufficient (in the absence of attacks) to recover 
the initial state $\mathbf{x}(0)$. Now, let us consider the case of a 
$k$-adversary which can arbitrarily corrupt any $k$ sensors. In the 
coding theoretic view, this corresponds to arbitrarily corrupting 
any $k$ (out of $p$) symbols in the observation vector. Intuitively, 
based on the relationship between error correcting codes and 
the Hamming distance between codewords in classical coding 
theory [?], one can expect the recovery of the initial state 
despite such corruptions to depend on the (symbol) Hamming 
distance between the observation vectors corresponding 
to two distinct initial states (say $\mathbf{x}^{(1)}(0)$ and $\mathbf{x}^{(2)}(0)$ with 
$\mathbf{x}^{(1)}(0) \neq \mathbf{x}^{(2)}(0)$). In this context, the following lemma relates 
$\theta$-sparse observability to the minimum Hamming distance 
between observation vectors in the absence of attacks; this 
leads to a (tight) bound on the number of attacked sensors 
that can be tolerated for state estimation. 

Lemma 1: For a $\theta$-sparse observable system with $p$ sensors, 
the minimum (symbol) Hamming distance between 
observation vectors corresponding to distinct initial states is 
$p - \theta + 1$. 

Proof: Consider observation vectors $\mathcal{Y}^{(1)}$ and $\mathcal{Y}^{(2)}$ 
corresponding to distinct initial states $\mathbf{x}^{(1)}(0)$ and $\mathbf{x}^{(2)}(0)$. Due 
to $\theta$-sparse observability, at most $\theta - 1$ symbols in $\mathcal{Y}^{(1)}$ 
and $\mathcal{Y}^{(2)}$ can be identical; if any $\theta$ of the symbols are 
identical, this would imply $\mathbf{x}^{(1)}(0) = \mathbf{x}^{(2)}(0)$. Hence, the 
(symbol) Hamming distance between the observation vectors 
$\mathcal{Y}^{(1)}$ and $\mathcal{Y}^{(2)}$ (corresponding to $\mathbf{x}^{(1)}(0)$ and $\mathbf{x}^{(2)}(0)$) is at 
least $p - (\theta - 1) = p - \theta + 1$ symbols. Furthermore, there 
exists a pair of initial states $\left(\mathbf{x}^{(1)}(0), \mathbf{x}^{(2)}(0)\right)$, such that the 
corresponding observation vectors $\mathcal{Y}^{(1)}$ and $\mathcal{Y}^{(2)}$ are identical 
in exactly $\theta - 1$ symbols^1 and differ in the rest $p - \theta + 1$ 
symbols. Hence, the minimum (symbol) Hamming distance 
between the observation vectors is $p - \theta + 1$. ■ 
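
The distance in Lemma 1 can be checked numerically on a small noiseless system. The following is a minimal sketch; the function names, the tolerance and the example matrices are illustrative assumptions (the pair $(\mathbf{A},\mathbf{C})$ below has $p = 3$ sensors and is 2-sparse observable, since no single sensor observes both state variables but every pair does).

```python
import numpy as np


def observation_symbols(A, C, x0):
    """Row d holds the symbol of sensor d: its outputs at t = 0, ..., n-1."""
    n = A.shape[0]
    x = np.asarray(x0, dtype=float)
    columns = []
    for _ in range(n):
        columns.append(C @ x)     # noiseless, attack-free outputs y(t) = C x(t)
        x = A @ x
    return np.stack(columns, axis=1)          # shape (p, n)


def symbol_hamming_distance(Y1, Y2, tol=1e-9):
    """Number of sensors whose symbols differ in at least one time instant."""
    return int(np.sum(np.any(np.abs(Y1 - Y2) > tol, axis=1)))


A = np.diag([1.0, 2.0])
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Y1 = observation_symbols(A, C, [1.0, 0.0])
Y2 = observation_symbols(A, C, [1.0, 1.0])
distance = symbol_hamming_distance(Y1, Y2)    # symbols of sensor 1 coincide
```

The two initial states agree in the first state variable, so the first sensor produces the same symbol for both, while the other two symbols differ. The resulting distance equals $p - \theta + 1$ for this example, matching the minimum in Lemma 1.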

The above lemma connects the problem of state estimation 
with sensor attacks in a dynamical system to error correction in 
classical coding theory. Since the minimum Hamming distance 
between the observation vectors corresponding to distinct 
initial states is $p - \theta + 1$, we can correct up to $k \leq \frac{p-\theta}{2}$ 
sensor corruptions; this is equivalent to the condition $\theta \leq 
p - 2k$, which is precisely the sparse observability condition 
required against a $k$-adversary^2. It should be noted that a 
$k$-adversary can attack any set of $k$ (out of $p$) sensors, and 
the condition $k \leq \frac{p-\theta}{2}$ is both necessary and sufficient for 
exact state estimation despite such attacks. When $k > \frac{p-\theta}{2}$, 
it is straightforward to show a scenario where the observation 
vector (after attacks) can be explained by multiple initial states, 
and hence exact state estimation is not possible. The following 
example illustrates such an attack scenario in view of the 
coding theoretic interpretation discussed above. 

^1 If there is no such pair of initial states, the initial state can be recovered by 
observing any $\theta - 1$ sensors. By definition, in a $\theta$-sparse observable system, 
$\theta$ is the smallest positive integer such that the initial state can be recovered 
by observing any $\theta$ sensors. 

^2 In addition, since the minimum Hamming distance is $p - \theta + 1$, we can 
detect attacks up to $(p - \theta + 1) - 1 = p - \theta$ sensor corruptions. 


Example 2: Consider a $\theta$-sparse observable system with 
$\theta = 2$, number of sensors $p = 5$, and a $k$-adversary with $k = 
2$. Clearly, the condition $k \leq \frac{p-\theta}{2}$ is not satisfied in this 
example. Let $\mathbf{x}^{(1)}(0)$ and $\mathbf{x}^{(2)}(0)$ be distinct initial states, such 
that the corresponding observation vectors $\mathcal{Y}^{(1)}$ and $\mathcal{Y}^{(2)}$ have 
(minimum) Hamming distance $p - \theta + 1 = 4$ symbols. Figure 1 
depicts the observation vectors $\mathcal{Y}^{(1)}$ and $\mathcal{Y}^{(2)}$, and for the sake 
of this example, we assume that the observation vectors have 
the same first symbol (i.e., $\mathcal{Y}_1^{(1)} = \mathcal{Y}_1^{(2)} = \mathcal{Y}_1$) and differ in 
the rest 4 symbols (hence, a Hamming distance of 4). Now, as 
shown in Figure 1, suppose the observation vector after attacks 
was $\mathcal{Y} = \left[\mathcal{Y}_1\ \mathcal{Y}_2^{(1)}\ \mathcal{Y}_3^{(1)}\ \mathcal{Y}_4^{(2)}\ \mathcal{Y}_5^{(2)}\right]$. Clearly, there are two 
possible explanations for this (attacked) observation vector: (a) 
the initial state was $\mathbf{x}^{(1)}(0)$ and sensors 4 and 5 were attacked, 
or (b) the initial state was $\mathbf{x}^{(2)}(0)$ and sensors 2 and 3 were 
attacked. Since there are two possibilities, we cannot estimate 
the initial state exactly given the attacked observation vector. 
This example can be easily generalized to show the necessity 
of the condition $k \leq \frac{p-\theta}{2}$. 

Fig. 1. Example with $\theta = 2$, $p = 5$ and $k = 2$. For distinct initial states $\mathbf{x}^{(1)}(0)$ 
and $\mathbf{x}^{(2)}(0)$, the corresponding observation vectors are $\mathcal{Y}^{(1)}$ and $\mathcal{Y}^{(2)}$. Both 
$\mathcal{Y}^{(1)}$ and $\mathcal{Y}^{(2)}$ have the same first symbol, but differ in the rest four symbols. 
Given (attacked) observation vector $\mathcal{Y} = \left[\mathcal{Y}_1\ \mathcal{Y}_2^{(1)}\ \mathcal{Y}_3^{(1)}\ \mathcal{Y}_4^{(2)}\ \mathcal{Y}_5^{(2)}\right]$, there are 
two possibilities for the initial state: (a) $\mathbf{x}^{(1)}(0)$ with attacks on sensors 4 and 
5, or (b) $\mathbf{x}^{(2)}(0)$ with attacks on sensors 2 and 3. 

For (noiseless) non-linear systems, by analogously defining 
$\theta$-sparse observability, the same coding theoretic interpretation 
holds. Hence, this leads to an alternative proof for the 
necessary and sufficient conditions for secure state estimation in any 
noiseless dynamical system. 

Appendix 

A. Algorithm performance analysis 

In this section, we analyze the performance of Algorithm 2. 
Similar to the analysis done for the scalar setting in Section IV, 
we first derive a bound using LLN, and then analyze the 
cross term in the bound to obtain final guarantees on the state 
estimation error in the presence of attacks. The details of the 
analysis are described below. 


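Consider the set $s$ of $p-k$ sensors which are not attacked 
by the $k$-adversary. For such a set $s$, the block residue $\mathbf{r}_i(t)$ for 
a subset $i$ of $s$ (subset of size $\theta$) can be expressed as shown 
below: 

$$\mathbf{r}_i(t) = \mathbf{O}_i\left(\mathbf{x}(t) - \hat{\mathbf{x}}_s(t)\right) + \mathbf{J}_i\begin{bmatrix}\mathbf{w}(t) \\ \vdots \\ \mathbf{w}(t+\mu_i-2)\end{bmatrix} + \begin{bmatrix}\mathbf{v}_i(t) \\ \vdots \\ \mathbf{v}_i(t+\mu_i-1)\end{bmatrix} = \mathbf{O}_i\mathbf{e}(t) + \mathbf{z}_{i,t:t+\mu_i-1}, \tag{20}$$

where $\mathbf{z}_{i,t:t+\mu_i-1}$ collects the process and sensor noise terms, 
and assuming that the Kalman filter corresponding to sensor 
set $s$ is in steady state: 

$$\mathbb{E}\left(\mathrm{tr}\left(\mathbf{O}_i^{\dagger}\mathbf{r}_i(t)\mathbf{r}_i^T(t)\mathbf{O}_i^{\dagger T}\right)\right) \overset{(a)}{=} P_{opt,s} + \mathrm{tr}\left(\mathbf{O}_i^{\dagger}\mathbf{M}_{\mu_i}\mathbf{O}_i^{\dagger T}\right), \tag{21}$$

where (a) follows from $\mathbf{M}_{\mu_i} = \mathbb{E}\left(\mathbf{z}_{i,t:t+\mu_i-1}\mathbf{z}_{i,t:t+\mu_i-1}^T\right) = 
\sigma_w^2\mathbf{J}_i\mathbf{J}_i^T + \sigma_v^2\mathbf{I}_{\theta\mu_i}$. Hence, due to LLN, the block residue test (19) 
will be satisfied w.h.p. for at least this set of good sensors and 
w.h.p. the algorithm will not return an empty set. Also, the 
estimate $\hat{\mathbf{x}}_s(t)$ from this set of good sensors trivially satisfies 
the error bound (6). But, since the algorithm can return any 
set of size $p-k$ which satisfies the block residue test, it may 
be possible that some of the sensors in the returned set are 
corrupt. In the remainder of our analysis, we show that for 
any set returned by the algorithm, the corresponding Kalman 
estimate achieves the error bound (6). 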

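Suppose the algorithm returns a set $s$ of $p-k$ sensors. Since $\theta \le p-2k$ (sparse observability condition), there exists a subset of $\theta$ good sensors in $s$. The following can be inferred when the block residue test is satisfied for such a subset $i$ (of size $\theta$):

$$\begin{aligned}
\frac{1}{N_B}\sum_{t\in G_l} \mathrm{tr}\left(O_i^{\dagger} r_i(t) r_i^T(t) O_i^{\dagger T}\right)
&\overset{(a)}{=} \frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t) + \frac{1}{N_B}\sum_{t\in G_l} \mathrm{tr}\left(O_i^{\dagger} z_{i,t:t+\mu_i-1} z_{i,t:t+\mu_i-1}^T O_i^{\dagger T}\right) \\
&\quad + \frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1} \\
&\overset{(b)}{\le} P_{\mathrm{opt},s} + \mathrm{tr}\left(O_i^{\dagger} M_{z_i} O_i^{\dagger T}\right) + \epsilon,
\end{aligned} \tag{22}$$

where $e(t)$ in (a) is the state estimation error at time $t$ (in the presence of a $k$-adversary) when $\hat{x}_s(t)$ is used as the state estimate, and (b) follows from the block residue test.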
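Using (a) and (b) above, for any $\epsilon > 0$ there exists a large enough $N_B$ such that:

$$\begin{aligned}
\frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t) + \frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}
&\le \mathrm{tr}\left(O_i^{\dagger} M_{z_i} O_i^{\dagger T}\right) - \frac{1}{N_B}\sum_{t\in G_l} \mathrm{tr}\left(O_i^{\dagger} z_{i,t:t+\mu_i-1} z_{i,t:t+\mu_i-1}^T O_i^{\dagger T}\right) + P_{\mathrm{opt},s} + \epsilon \qquad (23) \\
&\overset{(c)}{\le} P_{\mathrm{opt},s} + 2\epsilon, \qquad (24)
\end{aligned}$$

where (c) follows w.h.p. from LLN; for different time indices in $G_l$, $\mathrm{tr}\left(O_i^{\dagger} z_{i,t:t+\mu_i-1} z_{i,t:t+\mu_i-1}^T O_i^{\dagger T}\right)$ corresponds to i.i.d. realizations of the same random variable. Along the lines of the analysis done in the scalar setting in Section IV, we can show that the cross term $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ in the bound above has zero mean and vanishingly small variance as $N_B \to \infty$; this leads to the required bound on $\frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t)$. To complete our analysis, we calculate the mean and the variance of the cross term $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ as shown below.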


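The mean of $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ can be computed as shown below:

$$E\left(\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}\right) \overset{(a)}{=} \frac{2}{N_B}\sum_{t\in G_l} E\left(e^T(t)\right) E\left(O_i^{\dagger} z_{i,t:t+\mu_i-1}\right) = 0, \tag{25}$$

where (a) follows from the independence of $e(t)$ from $z_{i,t:t+\mu_i-1}$. This is true since both $x(t)$ and $\hat{x}_s(t)$ are independent¹ of $w(t)$ and $v_i(t)$. Also, using (25) and taking the expectation in (24):

$$E\left(\frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t)\right) \le P_{\mathrm{opt},s} + 2\epsilon. \tag{26}$$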


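¹ The adversary's corruptions till time $t-1$ can influence $\hat{x}_s(t)$, which is based on outputs till time $t-1$. Due to assumption (A1), the adversary's corruptions till time $t-1$ are independent of $w(t)$ and hence $\hat{x}_s(t)$ is independent of $w(t)$. Also, $x(t)$ is independent of $w(t)$. Due to assumption (A2), $\hat{x}_s(t)$ is independent of $v_i(t)$.

Now, we will show that the variance of the cross term $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ is vanishingly small as $N_B \to \infty$. Writing $z_t$ as shorthand for $z_{i,t:t+\mu_i-1}$, for any $\epsilon_1 > 0$ there exists a large enough $N_B$ such that:

$$\begin{aligned}
\mathrm{Var}\left(\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_t\right)
&\overset{(a)}{=} E\left(\left(\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_t\right)^2\right) \\
&= \frac{4}{N_B^2}\sum_{t\in G_l} E\left(\left(e^T(t)\, O_i^{\dagger} z_t\right)^2\right) + \frac{8}{N_B^2}\sum_{\substack{t,t'\in G_l \\ t<t'}} E\left(e^T(t)\, O_i^{\dagger} z_t\, e^T(t')\, O_i^{\dagger} z_{t'}\right) \\
&\overset{(b)}{=} \frac{4}{N_B^2}\sum_{t\in G_l} E\left(\left(e^T(t)\, O_i^{\dagger} z_t\right)^2\right) + \frac{8}{N_B^2}\sum_{\substack{t,t'\in G_l \\ t<t'}} E\left(e^T(t)\, O_i^{\dagger} z_t\, e^T(t')\, O_i^{\dagger}\right) E\left(z_{t'}\right) \\
&\overset{(c)}{=} \frac{4}{N_B^2}\sum_{t\in G_l} E\left(\left(e^T(t)\, O_i^{\dagger} z_t\right)^2\right) \\
&\overset{(d)}{=} \frac{4}{N_B^2}\sum_{t\in G_l} E\left(\mathrm{tr}\left(O_i^{\dagger} z_t \left(O_i^{\dagger} z_t\right)^T e(t) e^T(t)\right)\right) \\
&\overset{(e)}{=} \frac{4}{N_B^2}\sum_{t\in G_l} \mathrm{tr}\left(E\left(O_i^{\dagger} z_t \left(O_i^{\dagger} z_t\right)^T\right) E\left(e(t) e^T(t)\right)\right) \\
&\overset{(f)}{\le} \frac{4\lambda^*}{N_B}\left(\frac{1}{N_B}\sum_{t\in G_l} E\left(e^T(t)e(t)\right)\right) \\
&\overset{(g)}{\le} \frac{4\lambda^*}{N_B}\left(P_{\mathrm{opt},s} + 2\epsilon\right) \le \epsilon_1,
\end{aligned} \tag{27}$$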

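where (a) follows from (25), (b) follows from the independence of $z_{i,t':t'+\mu_i-1}$ from $e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}\, e^T(t')\, O_i^{\dagger}$ for $t' > t$, (c) follows from $E\left(z_{i,t':t'+\mu_i-1}\right) = 0$, (d) follows from $e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ being a scalar, (e) follows from the independence of $z_{i,t:t+\mu_i-1}$ from $e(t)$, and (f) follows from Lemma 2 (discussed in Appendix B) with eigenvalue $\lambda^* = \lambda_{\max}\left(E\left(O_i^{\dagger} z_{i,t:t+\mu_i-1} \left(O_i^{\dagger} z_{i,t:t+\mu_i-1}\right)^T\right)\right) = \lambda_{\max}\left(O_i^{\dagger} M_{z_i} O_i^{\dagger T}\right)$ (i.e., $\lambda^*$ is the maximum eigenvalue of $O_i^{\dagger} M_{z_i} O_i^{\dagger T}$). Finally, (g) follows from (26). This completes the variance analysis, and clearly the cross term $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ has vanishingly small variance as $N_B \to \infty$. As a result, using Chebyshev's inequality and (24), we have the following bound: for any $\epsilon_2 > 0$ and $\delta > 0$, there exists a large enough $N_B$ such that:

$$P\left(\frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t) \le P_{\mathrm{opt},s} + \epsilon_2\right) \ge 1 - \delta. \tag{28}$$

Since $\frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t) \le P_{\mathrm{opt},s} + \epsilon_2 \;\; \forall\, l \in \{0,1,\ldots,\mu_i - 1\}$ implies $\frac{1}{N}\sum_{t\in G} e^T(t)e(t) \le P_{\mathrm{opt},s} + \epsilon_2$, we have the required bound on $\frac{1}{N}\sum_{t\in G} e^T(t)e(t)$ from (28) as follows. For any $\epsilon_2 > 0$ and $\delta > 0$, there exists a large enough $N$ such that:

$$P\left(\frac{1}{N}\sum_{t\in G} e^T(t)e(t) \le P_{\mathrm{opt},s} + \epsilon_2\right) \ge 1 - \delta. \tag{29}$$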

This completes our performance analysis. 
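The role of the cross term can be illustrated numerically. The following is a minimal sketch, not the paper's setting: it assumes scalar, unit-variance Gaussian stand-ins for $e(t)$ and the noise term, drawn independently, and the function names (`cross_term`, `cross_term_variance`) are hypothetical. It only shows that the empirical variance of $\frac{2}{N_B}\sum_t e(t) z(t)$ shrinks as $N_B$ grows, in line with the $1/N_B$ scaling in (27).

```python
import random

def cross_term(n_b, rng):
    """Return (2/N_B) * sum_t e(t) * z(t) for independent zero-mean scalars."""
    total = 0.0
    for _ in range(n_b):
        e = rng.gauss(0.0, 1.0)  # assumed stand-in for the estimation error e(t)
        z = rng.gauss(0.0, 1.0)  # assumed stand-in for the noise term, independent of e(t)
        total += e * z
    return 2.0 * total / n_b

def empirical_variance(samples):
    """Population variance of a list of floats."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

def cross_term_variance(n_b, trials=200, seed=0):
    """Empirical variance of the cross term over repeated trials."""
    rng = random.Random(seed)
    return empirical_variance([cross_term(n_b, rng) for _ in range(trials)])

var_small = cross_term_variance(50)    # short groups
var_large = cross_term_variance(2000)  # long groups
print(var_small, var_large)
```

Under these assumptions the exact variance is $4/N_B$, so the second value should be much smaller than the first.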

B. Bounds on the trace of product of symmetric matrices 

A useful lemma from HI providing bounds on the trace 
of product of symmetric matrices is as follows. 

Lemma 2: If $A$ and $B$ are two symmetric matrices in $\mathbb{R}^{n\times n}$, and $B$ is positive semi-definite (i.e., $B \succeq 0$), then the following inequality holds:

$$\lambda_{\min}(A)\,\mathrm{tr}(B) \le \mathrm{tr}(AB) \le \lambda_{\max}(A)\,\mathrm{tr}(B), \tag{30}$$

where $\lambda_{\min}(A)$ and $\lambda_{\max}(A)$ denote the minimum and maximum eigenvalues of matrix $A$.
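As a quick sanity check of (30), the sketch below tests the inequality on random $2\times 2$ instances. It is illustrative only: the helper names are hypothetical, the matrices are restricted to $2\times 2$ so that the eigenvalues have a closed form, and $B$ is generated as $G^T G$ to guarantee positive semi-definiteness.

```python
import math
import random

def sym_eigs_2x2(a, b, d):
    """Eigenvalues (min, max) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def trace_of_product(A, B):
    """tr(AB) for 2x2 matrices given as nested lists."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

def check_lemma(rng):
    """Check lambda_min(A) tr(B) <= tr(AB) <= lambda_max(A) tr(B) on one random instance."""
    a, b, d = (rng.uniform(-3, 3) for _ in range(3))
    A = [[a, b], [b, d]]                      # random symmetric A
    g = [[rng.uniform(-2, 2) for _ in range(2)] for _ in range(2)]
    B = [[sum(g[k][i] * g[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]                   # B = G^T G is positive semi-definite
    lam_min, lam_max = sym_eigs_2x2(a, b, d)
    tr_b = B[0][0] + B[1][1]
    tr_ab = trace_of_product(A, B)
    tol = 1e-9                                # slack for floating-point rounding
    return lam_min * tr_b - tol <= tr_ab <= lam_max * tr_b + tol

rng = random.Random(1)
all_hold = all(check_lemma(rng) for _ in range(1000))
print(all_hold)
```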

C. Secure state filtering 

In this section, for the general linear dynamical system 
defined in 0. we study the filtering problem where the goal 
is to estimate the state at time t based on outputs till time t 
(in contrast to using outputs till time $t-1$ in the prediction


Algorithm 3 Secure State Filtering - vector case

1: Enumerate all sets $s \in S$ such that:

   $S = \{s \mid s \subset \{1,2,\ldots,p\},\ |s| = p-k\}$.

2: For each $s \in S$, run a Kalman filter that uses all sensors indexed by $s$. The corresponding Kalman (filtering) estimate is denoted by $\hat{x}_s(t) \in \mathbb{R}^n$.

3: For each set $s \in S$, enumerate all subsets of size $\theta$ and index them by $i$. Let $\mu_i$ be the observability index associated with sensor subset $i$. For each subset $i$ of $s$ (subset of size $\theta$), calculate the block residue:

   $$r_i(t) = \begin{bmatrix} y_i(t) \\ y_i(t+1) \\ \vdots \\ y_i(t+\mu_i-1) \end{bmatrix} - O_i \hat{x}_s(t) \quad \forall t \in G.$$

4: Pick the set $s^* \in S$ which satisfies the following block residue test for each subset $i$ of $s^*$ (subset of size $\theta$). Partition $G$ into $\mu_i$ groups $G_0, G_1, \ldots, G_{\mu_i-1}$ of size $N_B$ such that $G_l = \{t \mid ((t - t_1) \bmod \mu_i) = l\}$ and check that for each $G_l$:

   $$\frac{1}{N_B}\sum_{t\in G_l} \mathrm{tr}\left(O_i^{\dagger} r_i(t) r_i^T(t) O_i^{\dagger T}\right) \le F_{\mathrm{opt},s} + \mathrm{tr}\left(O_i^{\dagger} M_{z_i} O_i^{\dagger T}\right) - 2E\left(v_i^T(t) L_i^T O_i^{\dagger} v_{i,t:t+\mu_i-1}\right) + \epsilon, \tag{32}$$

   where $\epsilon > 0$ is a design parameter which can be made arbitrarily small for large enough $N_B$.

5: Return $s^*$ and $\hat{x}(t) := \hat{x}_{s^*}(t)\ \forall t \in G$.


problem). In the absence of sensor attacks, using a Kalman filter for state filtering in the system under consideration leads to the optimal (MMSE) error covariance asymptotically [12]. The Kalman filter update rule (in steady state) for the filtering problem (without sensor attacks) is as shown below:

$$\hat{x}(t) = \hat{x}^{(-)}(t) + L\left(y(t) - C\hat{x}^{(-)}(t)\right), \qquad \hat{x}^{(-)}(t+1) = A\hat{x}(t), \tag{31}$$

where $\hat{x}(t)$ is the state (filtering) estimate (see [12] for further details). The filtering error is defined as $e(t) = x(t) - \hat{x}(t)$, and as shown in (31), the state estimate $\hat{x}(t)$ at time $t$ depends on the outputs at time $t$. Also, in the absence of sensor attacks, $F_{\mathrm{opt},s}$ is the trace of the steady state (filtering) error covariance matrix obtained by using the Kalman filter on a sensor subset $s \subseteq \{1,2,\ldots,p\}$.
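The two-part update in (31) can be sketched for a scalar system. This is a minimal illustration under assumed values: the scalar model $x(t+1) = a\,x(t) + w(t)$, $y(t) = c\,x(t) + v(t)$ with noise variances `q` and `r`, and the function names, are assumptions made for the example and are not taken from the paper. The steady-state gain is obtained by simply iterating the scalar Riccati recursion.

```python
def steady_state_gain(a, c, q, r, iters=500):
    """Iterate the scalar Riccati recursion; return (gain L, filtering error variance)."""
    p_pred = q  # prediction error variance, arbitrary positive starting point
    for _ in range(iters):
        gain = p_pred * c / (c * c * p_pred + r)
        p_filt = (1.0 - gain * c) * p_pred
        p_pred = a * a * p_filt + q
    gain = p_pred * c / (c * c * p_pred + r)
    p_filt = (1.0 - gain * c) * p_pred
    return gain, p_filt

def filter_step(x_prior, y, a, c, gain):
    """One step of the update rule: measurement update, then time update."""
    x_filt = x_prior + gain * (y - c * x_prior)  # uses the output at time t
    x_prior_next = a * x_filt                    # prior estimate for time t + 1
    return x_filt, x_prior_next

gain, p_filt = steady_state_gain(a=0.9, c=1.0, q=1.0, r=1.0)
x_filt, x_next = filter_step(x_prior=0.0, y=2.0, a=0.9, c=1.0, gain=gain)
print(gain, p_filt, x_filt, x_next)
```

In this scalar picture, `p_filt` plays the role that $F_{\mathrm{opt},s}$ plays in the vector case, where it is the trace of the steady state filtering error covariance.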

For the secure state filtering problem, we assume that the sparse observability condition, and assumptions (A3) and (A4), hold against a $k$-adversary. In addition to the notation developed earlier for the prediction problem, we will require the following definition: $L_i$ denotes the matrix formed by the columns of $L$ corresponding to sensor subset $i$ of set $s$ (subset $i$ is of size $\theta$). The algorithm for secure state filtering (and its analysis) is similar to that for the prediction setting. In the remainder of this section, we first describe the secure state filtering algorithm and then analyze its performance.

Secure state filtering algorithm: Algorithm 3 shows the secure state filtering algorithm against a $k$-adversary. It is the same as Algorithm 2 except for the usage of the Kalman (filtering) estimate and the bound used for the block residue test (32); $F_{\mathrm{opt},s}$ is used instead of $P_{\mathrm{opt},s}$ and there is an extra term $-2E\left(v_i^T(t) L_i^T O_i^{\dagger} v_{i,t:t+\mu_i-1}\right)$ (where $v_{i,t:t+\mu_i-1}$ is as defined earlier).
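The combinatorial skeleton of the algorithm (enumerating candidate sensor sets, enumerating subsets, and splitting the time window into groups) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, $G$ is assumed to be the contiguous window $\{t_1, \ldots, t_1 + \mu_i N_B - 1\}$, and the per-time statistic $\mathrm{tr}(O_i^{\dagger} r_i(t) r_i^T(t) O_i^{\dagger T})$ and the right-hand side of the residue test are assumed to be computed elsewhere and passed in as plain numbers.

```python
from itertools import combinations

def candidate_sets(p, k):
    """All sensor sets s of size p - k (sensors are numbered 1..p)."""
    return list(combinations(range(1, p + 1), p - k))

def subsets_of_size(s, theta):
    """All subsets i of s containing theta sensors."""
    return list(combinations(s, theta))

def partition_window(t1, mu, n_b):
    """Split G = {t1, ..., t1 + mu*n_b - 1} into mu groups by (t - t1) mod mu."""
    groups = [[] for _ in range(mu)]
    for t in range(t1, t1 + mu * n_b):
        groups[(t - t1) % mu].append(t)
    return groups

def passes_residue_test(stats_per_group, threshold):
    """Accept only if the averaged statistic of every group is within the threshold."""
    return all(sum(g) / len(g) <= threshold for g in stats_per_group)

sets = candidate_sets(p=4, k=1)
groups = partition_window(t1=10, mu=2, n_b=3)
print(sets)    # [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
print(groups)  # [[10, 12, 14], [11, 13, 15]]
```

Times within one group are spaced $\mu_i$ apart, so the noise blocks $z_{i,t:t+\mu_i-1}$ for different times in the same group do not overlap, which is what the LLN step in the analysis relies on.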

Performance analysis: The performance analysis is similar to the analysis done for the prediction problem in Appendix A, and we describe the details below.

Consider the set $s$ of $p-k$ sensors which are not attacked by the $k$-adversary. For such a set $s$, the block residue $r_i(t)$ for a subset $i$ of $s$ (subset of size $\theta$) can be expressed as shown below:

$$r_i(t) = O_i\left(x(t) - \hat{x}_s(t)\right) + z_{i,t:t+\mu_i-1}, \tag{33}$$

and assuming that the Kalman filter corresponding to sensor set $s$ is in steady state, it can be shown that:

$$E\left(\mathrm{tr}\left(O_i^{\dagger} r_i(t) r_i^T(t) O_i^{\dagger T}\right)\right) = F_{\mathrm{opt},s} + \mathrm{tr}\left(O_i^{\dagger} M_{z_i} O_i^{\dagger T}\right) - 2E\left(v_i^T(t) L_i^T O_i^{\dagger} v_{i,t:t+\mu_i-1}\right). \tag{34}$$

Hence, due to LLN, the block residue test (32) will be satisfied w.h.p. for at least this set of good sensors and w.h.p. the algorithm will not return an empty set. Also, the estimate $\hat{x}_s(t)$ from this set of good sensors trivially satisfies the claimed error bound. But, since the algorithm can return any set of size $p-k$ which satisfies the block residue test, it may be possible that some of the sensors in the returned set are corrupt. In the remainder of our analysis, we show that for any set returned by the algorithm, the corresponding Kalman estimate achieves the claimed error bound.

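Suppose the algorithm returns a set $s$ of $p-k$ sensors. Since $\theta \le p-2k$ (sparse observability condition), there exists a subset of $\theta$ good sensors in $s$. The following can be inferred when the block residue test is satisfied for such a subset $i$ (of size $\theta$):

$$\begin{aligned}
\frac{1}{N_B}\sum_{t\in G_l} \mathrm{tr}\left(O_i^{\dagger} r_i(t) r_i^T(t) O_i^{\dagger T}\right)
&\overset{(a)}{=} \frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t) + \frac{1}{N_B}\sum_{t\in G_l} \mathrm{tr}\left(O_i^{\dagger} z_{i,t:t+\mu_i-1} z_{i,t:t+\mu_i-1}^T O_i^{\dagger T}\right) \\
&\quad + \frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1} \\
&\overset{(b)}{\le} F_{\mathrm{opt},s} + \mathrm{tr}\left(O_i^{\dagger} M_{z_i} O_i^{\dagger T}\right) - 2E\left(v_i^T(t) L_i^T O_i^{\dagger} v_{i,t:t+\mu_i-1}\right) + \epsilon,
\end{aligned} \tag{35}$$

where $e(t)$ in (a) is the state estimation error at time $t$ (in the presence of a $k$-adversary) when $\hat{x}_s(t)$ is used as the state estimate, and (b) follows from the block residue test (32).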
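Using (a) and (b) above, for any $\epsilon > 0$ there exists a large enough $N_B$ such that:

$$\begin{aligned}
&\frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t) + \frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1} \qquad (36) \\
&\quad \le F_{\mathrm{opt},s} + \mathrm{tr}\left(O_i^{\dagger} M_{z_i} O_i^{\dagger T}\right) - \frac{1}{N_B}\sum_{t\in G_l} \mathrm{tr}\left(O_i^{\dagger} z_{i,t:t+\mu_i-1} z_{i,t:t+\mu_i-1}^T O_i^{\dagger T}\right) - 2E\left(v_i^T(t) L_i^T O_i^{\dagger} v_{i,t:t+\mu_i-1}\right) + \epsilon \\
&\quad \overset{(c)}{\le} F_{\mathrm{opt},s} - 2E\left(v_i^T(t) L_i^T O_i^{\dagger} v_{i,t:t+\mu_i-1}\right) + 2\epsilon, \qquad (37)
\end{aligned}$$

where (c) follows w.h.p. from LLN, as for different time indices in $G_l$, $\mathrm{tr}\left(O_i^{\dagger} z_{i,t:t+\mu_i-1} z_{i,t:t+\mu_i-1}^T O_i^{\dagger T}\right)$ corresponds to i.i.d. realizations of the same random variable. Similar to the prediction problem, it can be shown that the cross term $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ in (36) has mean $-2E\left(v_i^T(t) L_i^T O_i^{\dagger} v_{i,t:t+\mu_i-1}\right)$ and has vanishingly small variance as $N_B \to \infty$. This leads to the claimed bound on the state estimation error, and we describe the details below.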

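For simplifying our calculations, we introduce the term $\bar{e}(t) = e(t) + L_i v_i(t)$. Due to assumptions (A3) and (A4), $\bar{e}(t)$ is independent of $w(t)$ and $v_i(t)$, and hence independent of $z_{i,t:t+\mu_i-1}$. Now, the mean of the cross term $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ can be computed as follows:

$$\begin{aligned}
E\left(\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}\right)
&= \frac{2}{N_B}\sum_{t\in G_l} \left(E\left(\bar{e}^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}\right) - E\left(v_i^T(t) L_i^T O_i^{\dagger} z_{i,t:t+\mu_i-1}\right)\right) \\
&\overset{(a)}{=} -2E\left(v_i^T(t) L_i^T O_i^{\dagger} z_{i,t:t+\mu_i-1}\right),
\end{aligned} \tag{38}$$

where (a) follows from the independence of $\bar{e}(t)$ from $z_{i,t:t+\mu_i-1}$. Also, using (38) and taking the expectation in (36):

$$E\left(\frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t)\right) \le F_{\mathrm{opt},s} + 2\epsilon. \tag{39}$$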


We now state the following claim, which is useful in our variance calculation for the cross term $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$.

Claim 1: Consider a subset $i$ of $s$ (subset of size $\theta$) which satisfies the residue test (32) in Algorithm 3. With $\bar{e}(t) = e(t) + L_i v_i(t)$, the following holds:

$$\frac{1}{N_B}\sum_{t\in G_l} E\left(\bar{e}^T(t)\bar{e}(t)\right) \le \eta_1, \tag{40}$$

where $\eta_1$ is a constant. Furthermore, $\forall d \in \{1,2,\ldots,n\}$, $\frac{1}{N_B}\sum_{t\in G_l} E\left(\bar{e}_d(t)\right) \le \eta_2$, where $\eta_2$ is a constant.

Proof: See Appendix D. ■

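Now, we will show that the variance of $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ is vanishingly small as $N_B \to \infty$. For any $\epsilon_1 > 0$, there exists a large enough $N_B$ such that:

$$\mathrm{Var}\left(\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}\right) \le \epsilon_1, \tag{41}$$

where, in the chain of steps establishing this bound, (a) follows from the independence of $e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}\, e^T(t')$ from $z_{i,t':t'+\mu_i-1}$, (b) follows from the independence of $\bar{e}(t)$ from $O_i^{\dagger} z_{i,t:t+\mu_i-1}\, v_i^T(t) L_i^T O_i^{\dagger} z_{i,t:t+\mu_i-1}$, (c) follows from the independence of $\bar{e}(t)$ from $O_i^{\dagger} z_{i,t:t+\mu_i-1}$, (d) follows from Lemma 2 (see Appendix B) with $\lambda^* = \lambda_{\max}\left(E\left(O_i^{\dagger} z_{i,t:t+\mu_i-1} \left(O_i^{\dagger} z_{i,t:t+\mu_i-1}\right)^T\right)\right)$, and (e) follows from Claim 1.

The above result implies that the variance of the cross term $\frac{2}{N_B}\sum_{t\in G_l} e^T(t)\, O_i^{\dagger} z_{i,t:t+\mu_i-1}$ is vanishingly small as $N_B \to \infty$. As a result, using Chebyshev's inequality and (37), we have the following bound: for any $\epsilon_2 > 0$ and $\delta > 0$, there exists a large enough $N_B$ such that: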

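$$P\left(\frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t) \le F_{\mathrm{opt},s} + \epsilon_2\right) \ge 1 - \delta. \tag{42}$$

Since $\frac{1}{N_B}\sum_{t\in G_l} e^T(t)e(t) \le F_{\mathrm{opt},s} + \epsilon_2 \;\; \forall\, l \in \{0,1,\ldots,\mu_i - 1\}$ implies $\frac{1}{N}\sum_{t\in G} e^T(t)e(t) \le F_{\mathrm{opt},s} + \epsilon_2$, we have the required bound on $\frac{1}{N}\sum_{t\in G} e^T(t)e(t)$ from (42) as follows. For any $\epsilon_2 > 0$ and $\delta > 0$, there exists a large enough $N$ such that:

$$P\left(\frac{1}{N}\sum_{t\in G} e^T(t)e(t) \le F_{\mathrm{opt},s} + \epsilon_2\right) \ge 1 - \delta. \tag{43}$$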


This completes our performance analysis. 


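D. Proof of Claim 1

Using (39), and noting that $e(t) = \bar{e}(t) - L_i v_i(t)$ with $\bar{e}(t)$ independent of $v_i(t)$:

$$\frac{1}{N_B}\sum_{t\in G_l} \left(E\left(\bar{e}^T(t)\bar{e}(t)\right) + E\left(v_i^T(t) L_i^T L_i v_i(t)\right)\right) \le F_{\mathrm{opt},s} + 2\epsilon. \tag{44}$$

The above result implies that $\frac{1}{N_B}\sum_{t\in G_l} E\left(\bar{e}^T(t)\bar{e}(t)\right)$ is bounded by a constant. This also implies that $\frac{1}{N_B}\sum_{t\in G_l} E\left(\bar{e}_d(t)\right)$ is bounded by a constant $\forall d \in \{1,2,\ldots,n\}$ as shown below:

$$\begin{aligned}
\frac{1}{N_B}\sum_{t\in G_l} E\left(\bar{e}_d(t)\right)
&\overset{(a)}{\le} \frac{1}{N_B}\sum_{t\in G_l} \sqrt{E\left(\bar{e}_d^2(t)\right)} \\
&\le 1 + \frac{1}{N_B}\sum_{t\in G_l} E\left(\bar{e}_d^2(t)\right) \\
&\le 1 + \frac{1}{N_B}\sum_{t\in G_l} E\left(\bar{e}^T(t)\bar{e}(t)\right),
\end{aligned} \tag{45}$$

where (a) follows from Jensen's inequality. This completes the proof of Claim 1.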











